Classification of speech under stress using target driven features
Authors
Abstract
Speech production variations due to perceptually induced stress contribute significantly to reduced speech processing performance. One approach for assessment of production variations due to stress is to formulate an objective classification of speaker stress based upon the acoustic speech signal. This study proposes an algorithm for estimation of the probability of perceptually induced stress. It is suggested that the resulting stress score could be integrated into robust speech processing algorithms to improve robustness in adverse conditions. First, results from a previous stress classification study are employed to motivate selection of a targeted set of speech features on a per phoneme and stress group level. Analysis of articulatory, excitation and cepstral based features is conducted using a previously established stressed speech database (Speech Under Simulated and Actual Stress (SUSAS)). Stress sensitive targeted feature sets are then selected across ten stress conditions (including Apache helicopter cockpit, Angry, Clear, Lombard effect, Loud, etc.) and incorporated into a new targeted neural network stress classifier. Second, the targeted feature stress classification system is evaluated and shown to achieve closed speaker, open token classification rates of 91.0%. Finally, the proposed stress classification algorithm is incorporated into a stress directed speech recognition system, where separate hidden Markov model recognizers are trained for each stress condition. An improvement of +10.1% and +15.4% over conventionally trained neutral and multi-style trained recognizers is demonstrated using the new stress directed recognition approach.
Résumé: Variations in speech production due to induced stress contribute significantly to reduced performance in speech processing systems. One approach to estimating these variations is to establish an objective classification of speaker stress based on the acoustic signal. This study proposes an algorithm for estimating the probability of induced stress. The stress score predicted by this algorithm can be integrated into speech processing algorithms to increase their robustness in difficult environments. Results from a previous study on stress classification are first used to select a set of speech parameters related to the phoneme and the type of stress. An analysis of articulatory, excitation and cepstral parameters is conducted on a database of speech under stress ("Speech Under Simulated and Actual Stress" (SUSAS)). The sensitive parameters …
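The stress directed recognition step described in the abstract (classify the stress condition first, then decode with the recognizer trained for that condition) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `stress_directed_recognize`, `toy_classifier`, the two condition labels, and the placeholder recognizers. The paper's actual classifier is a targeted-feature neural network and its recognizers are hidden Markov models; the stand-ins below only show the routing logic.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical type aliases for illustration; not from the paper.
Features = Sequence[float]
StressClassifier = Callable[[Features], Dict[str, float]]
Recognizer = Callable[[Features], str]


def stress_directed_recognize(
    features: Features,
    classify: StressClassifier,
    recognizers: Dict[str, Recognizer],
    fallback: str = "neutral",
) -> Tuple[str, str]:
    """Pick the most probable stress condition, then decode with its recognizer."""
    scores = classify(features)                 # condition -> stress score
    condition = max(scores, key=scores.get)     # highest-scoring condition
    # Fall back to the neutral recognizer if no model exists for the condition.
    recognizer = recognizers.get(condition, recognizers[fallback])
    return condition, recognizer(features)


def toy_classifier(features: Features) -> Dict[str, float]:
    """Stand-in classifier (assumption): mean squared amplitude as a 'loud' score."""
    energy = sum(x * x for x in features) / len(features)
    p_loud = min(1.0, energy)
    return {"neutral": 1.0 - p_loud, "loud": p_loud}


# Placeholder recognizers; a real system would wrap per-condition HMM decoders.
toy_recognizers: Dict[str, Recognizer] = {
    "neutral": lambda f: "hypothesis-from-neutral-model",
    "loud": lambda f: "hypothesis-from-loud-model",
}

if __name__ == "__main__":
    print(stress_directed_recognize([0.9, 0.9], toy_classifier, toy_recognizers))
    print(stress_directed_recognize([0.1, 0.1], toy_classifier, toy_recognizers))
```

The sketch makes a hard decision by taking the single highest-scoring condition. Because the abstract describes the classifier output as a probability-like stress score, a soft combination of recognizer outputs would be another possible design, but the source does not say which is used.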
Similar resources
Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain
This article presents a new feature extraction technique based on the temporal tracking of clusters in spectro-temporal features space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change during the time. The clusters temporally tracked and temporal tra...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognizing is selecting of appropriate feature sets in order to improve the detection rate and classification accuracy. In last studies researchers tried to select the appropriate features for classification by using the selecting and reducing the space of features methods, such as the Fisher and PCA. In this research, a hybrid evolutionary algorit...
Nonlinear feature based classification of speech under stress
Studies have shown that variability introduced by stress or emotion can severely reduce speech recognition accuracy. Techniques for detecting or assessing the presence of stress could help improve the robustness of speech recognition systems. Although some acoustic variables derived from linear speech production theory have been investigated as indicators of stress, they are not always consiste...
Classification of speech under stress based on features derived from the nonlinear Teager energy operator
Studies have shown that distortion introduced by stress or emotion can severely reduce speech recognition accuracy. Techniques for detecting or assessing the presence of stress could help neutralize stressed speech and improve robustness of speech recognition systems. Although some acoustic variables derived from linear speech production theory have been investigated as indicators of stress, t...
Journal title:
- Speech Communication
Volume 20
Publication date: 1996